Book Review: Linguistic Structure Prediction by Noah A. Smith
Abstract
Noah Smith's ambitious new monograph, Linguistic Structure Prediction, "aims to bridge the gap between natural language processing and machine learning." Given that current natural language processing (NLP) research makes heavy demands on machine-learning techniques, and a sizeable fraction of modern machine learning (ML) research focuses on structure prediction, this is clearly a timely and important topic. To address the gaps and overlaps between these two large and well-developed fields in five brief chapters is a difficult feat. The text, though not without its flaws, does an admirable job of building this bridge.
An introductory first chapter surveys current research areas in statistical NLP, cataloging and defining many common linguistic structure prediction tasks. Machine learning students new to the area are likely to find this helpful, albeit a bit terse; NLP students will likely consider this section primarily a review. The subsequent chapters change character abruptly, delving into mathematical details and heavy formalism.
Chapter 2 introduces the concept of decoding, presenting five distinct viewpoints on the search for the highest-scoring structure. The reader is quickly ushered through graphical models, polytopes, grammars, hypergraphs, and weighted deduction systems, with descriptions based on an example in sequence tagging. The broad-coverage, multi-viewpoint discussion encourages the reader to make connections between many distinct approaches, and provides solid formalism for reasoning about decoding problems. It is a comprehensive introduction to the most common and effective decoding approaches, with one significant exception: the recent advances in dual decomposition and Lagrangian relaxation methods. Timing is likely the culprit. This book was developed mainly from 2006 to 2009, whereas dual decomposition did not attain prominence in our community until a few years later (Rush et al. 2010). Relaxation approaches, though potentially a passing phase, have successfully broadened the reach of simpler decoding techniques into more complicated domains such as structured event extraction. They would have made a nice addition. Regardless, this second chapter equips the reader with sufficient machinery to solve a number of structured prediction problems.
Chapter 3 applies the machinery described in the prior chapter to the problem of supervised structure induction. Probabilistic generative and conditional models are introduced in some detail, followed by a discussion of margin-based methods. Hidden Markov models (HMMs) and probabilistic context-free grammars are introduced in detail, followed by solid descriptions of maximum likelihood estimation and smoothing. The section on conditional models is well written and crucial, because so …
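As an illustration of what the review means by decoding as "the search for the highest scoring structure" in sequence tagging, the following is a minimal sketch of Viterbi decoding for a first-order tagger. It is not taken from the book or the review: the function name `viterbi`, the score tables, the toy weights, and the `UNK` fallback score are all assumptions made for this example, and scores are treated as additive (for instance, log-probabilities).

```python
from typing import Dict, List, Tuple

# Hypothetical fallback score for a word a tag has never emitted (assumption).
UNK = -5.0


def viterbi(
    words: List[str],
    tags: List[str],
    start: Dict[str, float],
    trans: Dict[str, Dict[str, float]],
    emit: Dict[str, Dict[str, float]],
) -> Tuple[float, List[str]]:
    """Return (best score, best tag sequence) for a non-empty word list.

    The score of a tag sequence is the sum of a start score, one transition
    score per adjacent tag pair, and one emission score per word.
    """
    # best[i][t]: score of the best tagging of words[: i + 1] that ends in tag t
    best = [{t: start[t] + emit[t].get(words[0], UNK) for t in tags}]
    back: List[Dict[str, str]] = []  # back[i][t]: best previous tag before t

    for word in words[1:]:
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for t in tags:
            # Pick the predecessor tag that maximizes the running score.
            prev = max(tags, key=lambda p: best[-1][p] + trans[p][t])
            scores[t] = best[-1][prev] + trans[prev][t] + emit[t].get(word, UNK)
            pointers[t] = prev
        best.append(scores)
        back.append(pointers)

    # Follow back-pointers from the best final tag to recover the sequence.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[-1][last], path


# Toy weights, invented for illustration only.
TAGS = ["N", "V"]
START = {"N": 0.0, "V": -2.0}
TRANS = {"N": {"N": -1.0, "V": 0.0}, "V": {"N": 0.0, "V": -2.0}}
EMIT = {"N": {"dogs": 0.0, "bark": -2.0}, "V": {"dogs": -3.0, "bark": 0.0}}

score, path = viterbi(["dogs", "bark"], TAGS, START, TRANS, EMIT)
print(score, path)
```

The dynamic program considers every tag sequence implicitly while doing work proportional to the sentence length times the square of the number of tags, which is the kind of efficiency argument that the graphical-model, hypergraph, and weighted-deduction views of decoding each express in their own terms.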
Similar resources
Linguistic Structure Prediction
A major part of natural language processing now depends on the use of text data to build linguistic analyzers. We consider statistical, computational approaches to modeling linguistic structure. We seek to unify across many approaches and many kinds of linguistic structures. Assuming a basic understanding of natural language processing and/or machine learning, we seek to bridge the gap between ...
Unsupervised Structure Prediction with Non-Parallel Multilingual Guidance
We describe a method for prediction of linguistic structure in a language for which only unlabeled data is available, using annotated data from a set of one or more helper languages. Our approach is based on a model that locally mixes between supervised models from the helper languages. Parallel data is not used, allowing the technique to be applied even in domains where human-translated texts ...
Linguistic Structured Sparsity in Text Categorization
We introduce three linguistically motivated structured regularizers based on parse trees, topics, and hierarchical word clusters for text categorization. These regularizers impose linguistic bias in feature weights, enabling us to incorporate prior knowledge into conventional bag-of-words models. We show that our structured regularizers consistently improve classification accuracies compared to ...
Book Review: 'Ecolinguistics: Language and ecology'
Ecolinguistics: Language and ecology delivers an overall view of, and a critical approach to, ecolinguistic studies. This book is an excellent resource for students, researchers, linguists, and those working in the area of discourse analysis as well as ecology. The book claims to present a new course for ecolinguistics including a framework for understanding the theory of ecolinguistics, exploration ...
Noah A. Smith
Modulo formatting, this document constitutes a portion of my application for tenure (“section 5”). It details my research, teaching, and service efforts and goals. It is not exhaustive, and it reflects my view of my activities in late May 2013. 1 Research My research goal is to automate inference from natural language text, including: • algorithms that interpret text into abstract linguistic st...
Publication date: 2012